class: inverse, center, title-slide, middle <style> .title-slide .remark-slide-number { display: none; } </style> # .title-wrap[Intro to Programming with R for Political Scientists] <br /> ## .header-fancy[Session 2: Base R and Tidyverse Basics] ### Markus Freitag ### Geschwister Scholl Institute of Political Science, LMU ### [<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#415564;" xmlns="http://www.w3.org/2000/svg"> <path d="M459.37 151.716c.325 4.548.325 9.097.325 13.645 0 138.72-105.583 298.558-298.558 298.558-59.452 0-114.68-17.219-161.137-47.106 8.447.974 16.568 1.299 25.34 1.299 49.055 0 94.213-16.568 130.274-44.832-46.132-.975-84.792-31.188-98.112-72.772 6.498.974 12.995 1.624 19.818 1.624 9.421 0 18.843-1.3 27.614-3.573-48.081-9.747-84.143-51.98-84.143-102.985v-1.299c13.969 7.797 30.214 12.67 47.431 13.319-28.264-18.843-46.781-51.005-46.781-87.391 0-19.492 5.197-37.36 14.294-52.954 51.655 63.675 129.3 105.258 216.365 109.807-1.624-7.797-2.599-15.918-2.599-24.04 0-57.828 46.782-104.934 104.934-104.934 30.213 0 57.502 12.67 76.67 33.137 23.715-4.548 46.456-13.32 66.599-25.34-7.798 24.366-24.366 44.833-46.132 57.827 21.117-2.273 41.584-8.122 60.426-16.243-14.292 20.791-32.161 39.308-52.628 54.253z"></path></svg>](https://twitter.com/MarkusGFreitag) [<svg viewBox="0 0 496 512" style="height:1em;position:relative;display:inline-block;top:.1em;fill:#415564;" xmlns="http://www.w3.org/2000/svg"> <path d="M336.5 160C322 70.7 287.8 8 248 8s-74 62.7-88.5 152h177zM152 256c0 22.2 1.2 43.5 3.3 64h185.3c2.1-20.5 3.3-41.8 3.3-64s-1.2-43.5-3.3-64H155.3c-2.1 20.5-3.3 41.8-3.3 64zm324.7-96c-28.6-67.9-86.5-120.4-158-141.6 24.4 33.8 41.2 84.7 50 141.6h108zM177.2 18.4C105.8 39.6 47.8 92.1 19.3 160h108c8.7-56.9 25.5-107.8 49.9-141.6zM487.4 192H372.7c2.1 21 3.3 42.5 3.3 64s-1.2 43-3.3 64h114.6c5.5-20.5 8.6-41.8 8.6-64s-3.1-43.5-8.5-64zM120 256c0-21.5 1.2-43 3.3-64H8.6C3.2 212.5 0 233.8 0 256s3.2 43.5 8.6 
64h114.6c-2-21-3.2-42.5-3.2-64zm39.5 96c14.5 89.3 48.7 152 88.5 152s74-62.7 88.5-152h-177zm159.3 141.6c71.4-21.2 129.4-73.7 158-141.6h-108c-8.8 56.9-25.6 107.8-50 141.6zM19.3 352c28.6 67.9 86.5 120.4 158 141.6-24.4-33.8-41.2-84.7-50-141.6h-108z"></path></svg>](https://markusfreitag.netlify.app/) ### 2021-07-03 <a href="https://github.com/m-freitag" class="github-corner" aria-label="View source on Github"><svg width="80" height="80" viewBox="0 0 250 250" style="fill:#415564; color:#f6f3f2; position: absolute; top: 0; border: 0; right: 0;" aria-hidden="true"><path d="M0,0 L115,115 L130,115 L142,142 L250,250 L250,0 Z"></path><path d="M128.3,109.0 C113.8,99.7 119.0,89.6 119.0,89.6 C122.0,82.7 120.5,78.6 120.5,78.6 C119.2,72.0 123.4,76.3 123.4,76.3 C127.3,80.9 125.5,87.3 125.5,87.3 C122.9,97.6 130.6,101.9 134.4,103.2" fill="currentColor" style="transform-origin: 130px 106px;" class="octo-arm"></path><path d="M115.0,115.0 C114.9,115.1 118.7,116.5 119.8,115.4 L133.7,101.6 C136.9,99.2 139.9,98.4 142.2,98.6 C133.8,88.0 127.5,74.4 143.8,58.0 C148.5,53.4 154.0,51.2 159.7,51.0 C160.3,49.4 163.2,43.6 171.4,40.1 C171.4,40.1 176.1,42.5 178.8,56.2 C183.1,58.6 187.2,61.8 190.9,65.4 C194.5,69.0 197.7,73.2 200.1,77.6 C213.8,80.2 216.3,84.9 216.3,84.9 C212.7,93.1 206.9,96.0 205.4,96.6 C205.1,102.4 203.0,107.8 198.3,112.5 C181.9,128.9 168.3,122.5 157.7,114.1 C157.9,116.9 156.7,120.9 152.7,124.9 L141.0,136.5 C139.8,137.7 141.6,141.9 141.8,141.8 Z" fill="currentColor" class="octo-body"></path></svg></a><style>.github-corner:hover .octo-arm{animation:octocat-wave 560ms ease-in-out}@keyframes octocat-wave{0%,100%{transform:rotate(0)}20%,60%{transform:rotate(-25deg)}40%,80%{transform:rotate(10deg)}}@media (max-width:500px){.github-corner:hover .octo-arm{animation:none}.github-corner .octo-arm{animation:octocat-wave 560ms ease-in-out}}</style> --- # Overview 1. .hl[Intro] 2. .hl[R-Studio and (Git)Hub] 3. .hl[Base R & Tidyverse Basics] 4. .hl[Data Wrangling I] 5. .hl[Data Wrangling II] 6. 
.hl[Data Viz]
7. .hl[Writing Functions]
8. .hl[A complete scientific workflow with R]

---

# Trivia

- R was designed in 1993 by Ross Ihaka and Robert Gentleman.
- It builds upon the S programming language by John Chambers.
- It is named R as a play on S and because of the first names of the authors.
- There are 17788 packages available on [CRAN](https://cran.r-project.org/) as of 2021-07-03.
- [RStudio](https://www.rstudio.com/about/) `\(\neq\)` [R-Core Team](https://www.r-project.org/contributors.html); the former is a mix of a for-profit and a non-profit company; highly committed to producing free & open-source products; has some business solutions.

<img src="data:image/png;base64,#Figs/devs.png" width="25%" style="display: block; margin: auto;" />

.center[.font50[[Image source and more R-History trivia](https://rss.onlinelibrary.wiley.com/doi/epdf/10.1111/j.1740-9713.2018.01169.x)]]

---

# Workflow

[//]: # (ANCHOR Workflow)

- You forked and cloned the course repo.
- Navigate to `Session Scripts > Session 2` and open `Session_2_script.R`.
- You will see a pre-formatted script containing some useful information on comments and formatting.
- The script is otherwise empty. Fill it with the material I discuss on the slides.

--

- Make comments for yourself. Learn as you write. Explore. Stage, commit and push.

- Hopefully not you → <img src="https://www.cs.cmu.edu/~cangiuli/img/angry.gif" align="right">

- If you have a second monitor, great! If not, split your screen.

---
class: inverse, center, middle
name: intro

# CalculatoR

---

# CalculatoR

```r
7 + 5 # The [1] in the output is the index of the first element printed on that line.
```

```
## [1] 12
```

```r
4 * 5 + 2 / 3^3 # Exponentiation first, then multiplication and division, then addition and subtraction
```

```
## [1] 20.07407
```

--

```r
# Modulo operators:
10 %/% 3 # Integer division
```

```
## [1] 3
```

```r
10 %% 3 # Remainder (modulo)
```

```
## [1] 1
```

---

# CalculatoR

.pull-left[

```r
# Relational and logical operators
3 < 4
```

```
## [1] TRUE
```

```r
2 == 1 & 4 > 2 # == "equal to"; & "element-wise logical AND"
```

```
## [1] FALSE
```

```r
2 == 1 | 4 > 2 # | "element-wise logical OR"
```

```
## [1] TRUE
```

]

--

.pull-right[

```r
3 != 4 # != "not equal"
```

```
## [1] TRUE
```

```r
7 %in% 300:500 # %in% checks whether the left-hand values occur in the right-hand object
```

```
## [1] FALSE
```

```r
# Take care with the order of precedence...
```

]

---

# CalculatoR

.pull-left[

```r
# Floating point numbers
0.1 + 0.2
```

```
## [1] 0.3
```

```r
0.1 + 0.2 == 0.3
```

```
## [1] FALSE
```

Why?!

> [Because internally, computers use a format (binary floating-point) that cannot accurately represent a number like 0.1, 0.2 or 0.3 at all.](https://floating-point-gui.de/basic/)

]

.pull-right[

<img src="data:image/png;base64,#Figs/robot.png" width="60%" style="display: block; margin: auto;" />

]

---
class: inverse, center, middle
name: intro

# A Primer on OOP ("Object Oriented Programming")

---

# Object Oriented Programming

> Everything is an object and everything has a name.

<img src="data:image/png;base64,#Classes1.svg" width="80%" style="display: block; margin: auto;" />

---

# Object Oriented Programming

> Everything is an object and everything has a name.

<img src="data:image/png;base64,#Classes.svg" width="80%" style="display: block; margin: auto;" />

---

# Functions

- [Functions](https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Function-objects) are objects; we will discuss them in more detail in Session x.
- For now, just think of them in the usual mathematical sense, where we pass some argument and get back a value. E.g. `\(f(x) = \frac{2x + 3}{\sqrt{3}}\)`.
- In R, one or multiple arguments get passed to the function body (where the function is defined) and you get back a result.

--

- For instance, to define the above function and call it, we specify the following:

```r
f <- function(x) {
  (2 * x + 3) / sqrt(3)
}

f(3)
```

```
## [1] 5.196152
```

---

# Functions

- There are many functions in R: some are written by users and scientists and distributed in packages, others are built-in functions of "base" R.<sup>.font70[[1]]</sup>
- A really helpful built-in function - in the literal sense - is the... `help()` function.
- It gives you more information about the usage and arguments of a built-in/user-written function.

--

We can also call `help()` to get help about help:

```r
# help(help)
# Or for short:
?help
```

- But let's not get ahead of ourselves... what did this arrow (`<-`) thing do?

.font70[<sup>[1]</sup>.hl[Fine Point:] The arithmetic, logical and relational operators we met are actually also function calls.]

---

# Making Objects: Assignment (Operators)

.pull-left[

- You can use `<-` or `=` for assignment.
- For instance,

```r
a <- 3 # Or a = 3; under the hood, the name `a` points to the object holding 3
```

assigns the name `a` to an object of type/mode numeric holding the value 3, i.e. it binds an object to a name. Simplification:

> creates an object named `a`, containing the value `3`.

]

--

.pull-right[

- Using

```r
class(a)
mode(a)
```

gives you information about the class/type of the object. `class()` gives the class of the object from an OOP point of view, `typeof()` (or `mode()`) the base type.
- In this case, this is not very interesting as we created an .hl2[atomic] numeric vector.
- What is the class/type of `b <- 3 > 2` and `c <- "string"`?
]

---

# Making Objects: Assignment (Operators)

- Using `=` is legal as per the man, the myth, the legend Ross Ihaka himself:

<center><iframe width="560" height="315" src="https://www.youtube.com/embed/88TftllIjaY?start=2100" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe></center>

- But there are also some sensible arguments to stick mostly to `<-` in casual settings (e.g. readability: easier to discriminate from function arguments).
- Bottom line: .hl[be consistent].

---

# Naming Conventions

- For readability, we want names of/bound to objects to be meaningful.
- Pretty easy to do when working with "real" data; laziness is pretty much the only obstacle ;)
- There are several naming conventions. I like `snake_case` and `camelCase` the most. We will use snake case in this course.
- .hl[Be consistent].
- There are some ["forbidden"/"reserved" words](https://stat.ethz.ch/R-manual/R-devel/library/base/html/Reserved.html) that cannot be assigned as names. E.g. `NA` (logical constant indicating missing values), `if`, `else`, `break`.

---

# Workspace/Environment

- In contrast to Stata, R can hold multiple objects at a time.
- This is very convenient; you can copy, modify etc. R also copies .hl2[A LOT] internally (one reason why it's sort of a slow language).
- The global environment is the interactive workspace you usually work with.
- In RStudio you can inspect some objects by clicking on them (equivalent to calling `View()`).

--

- We won't go into the details of environments but see [here](https://adv-r.hadley.nz/environments.html) for an advanced treatment.
- This will get more intuitive soon (e.g. when we take on different data structures on the next few slides).

---

# Vectors

- Vectors are the most fundamental data structure in R.
- As all elements of a vector have to be of the same type (e.g. numeric), they are often called **atomic** vectors.
- *[Technical Fine Point:](https://adv-r.hadley.nz/vectors-chap.html)* Vectors have attributes, importantly dimension and class. With the dimension attribute, vectors can become arrays and matrices. With the class attribute, we can build an S3 object on top of the base type (see e.g. the factor type).

--

We can build longer vectors by concatenating shorter ones using the `c()` function:

```r
chr_var <- c("a", "b", "c", "d") # A 4-element atomic vector of type character
```

Indexing:

```r
chr_var[3] # Returns the third element of object "chr_var".
chr_var[1:3] # Returns the first three.
```

---

# Lists

- Objects of type list are highly versatile. They are vectors but more "generic".
- I.e. elements of a list can have different types.<sup>.font70[[2]]</sup> For example:

```r
list_a <- list(
  5:7,
  "string",
  c(TRUE, TRUE),
  c(1.23, 4.20)
)
```

constructs a list using the function `list()`.

.font70[<sup>[2]</sup>.hl[Fine Point:] A list does not store objects of different types, it holds references to them.]

---

# Lists

- Indexing:

.pull-left[

```r
list_a[3] # This returns a list containing element three.
```

```
## [[1]]
## [1] TRUE TRUE
```

```r
list_a[[4]] # We need double brackets to index a list element.
```

```
## [1] 1.23 4.20
```

```r
# Index within a list element:
list_a[[4]][2]
```

```
## [1] 4.2
```

]

.pull-right[

```r
# For named lists, we can index with $:
names(list_a) <- c("a", "b", "c", "d")
# We can also name elements directly: list(x = 1, y = 2)
list_a$b
```

```
## [1] "string"
```

```r
list_a$d[1]
```

```
## [1] 1.23
```

```r
# Adding a new object-reference, i.e. an element, to a list:
list_a$e <- c("R", "is fun")
```

]

---

# Factors

- To represent categorical variables, we often use factor objects in R (e.g. this makes it easier to get counts of all categories).
- They are built on top of integer vectors and come with a levels attribute.
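
To make the second point concrete, here is a minimal sketch with a made-up vector (`party` is purely illustrative and not part of the course data). Stripping the factor down with `as.integer()` shows the stored integer codes, and `levels()` shows the labels those codes point to.

```r
# Hypothetical example data, not from the course materials
party <- factor(c("SPD", "CDU", "SPD", "FDP"), levels = c("CDU", "FDP", "SPD"))

codes <- as.integer(party) # the integer codes stored under the hood: 3 1 3 2
labels <- levels(party)    # the labels those codes point to

# Codes plus levels give back the original strings
recovered <- labels[codes]
print(recovered)
```
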
.code70[

```r
backgrounds_char <- c("none", "Stata", "Stata", "Stata", "R") # Students' programming backgrounds

backgrounds_fac <- factor(backgrounds_char, levels = c("none", "Stata", "R"))

# Or (the levels then default to sorted order):
# backgrounds_fac <- factor(c("none", "Stata", "Stata", "Stata", "R"))
```

]

.code70[
.pull-left[

```r
class(backgrounds_fac)
```

```
## [1] "factor"
```

]
]

.code70[
.pull-right[

```r
typeof(backgrounds_fac)
```

```
## [1] "integer"
```

]
]

.code70[
.pull-left[

```r
table(backgrounds_fac)
```

]
]

.code70[
.pull-right[

```
## backgrounds_fac
##  none Stata     R 
##     1     3     1
```

]
]

---

# Data Frames

- Data frames are special lists: lists of equal-length vectors.
- You likely already have a feel for their structure from Stata or other software; they are crucial for data analysis --> the rectangle we love.

.code60[
.pull-left[

```r
set.seed(666)

df_langs <- data.frame(
  background = factor(c(
    "Python", "Stata", "Stata", "Stata", "R"
  )),
  skill = rnorm(5)
)

# View(df_langs)
head(df_langs) # Print the first few rows of an object (6 by default)

# Indexing
df_langs[2, 2] # Second row of second column
df_langs$background[1]
df_langs[df_langs$skill < 0, ] # Combined with our operators, we can also filter
```

]
]

.code60[
.pull-right[

```
##   background      skill
## 1     Python  0.7533110
## 2      Stata  2.0143547
## 3      Stata -0.3551345
## 4      Stata  2.0281678
## 5          R -2.2168745
```

```
## [1] 2.014355
```

```
## [1] Python
## Levels: Python R Stata
```

```
##   background      skill
## 3      Stata -0.3551345
## 5          R -2.2168745
```

]
]

---

# Data Frames

- So we have an `\(\color{#e59e34}n\)` (num. of obs.) `\(\times\)` `\(\color{#5cb194}k\)` (num. of vars) matrix with observations `\(\color{#e59e34}i \in \{\color{#e59e34}{1,\dots,n}\}\)` and variables `\(\color{#5cb194}j \in \{\color{#5cb194}{1,\dots,k}\}\)`.

```r
dim(df_langs)
```

.remark-inline-output[\#\# [1] .hl[5] .hl2[2]]

$$ `\begin{align} \mathbf{X} = \begin{bmatrix} x_{\color{#e59e34}{1},\color{#5cb194}{1}} & x_{\color{#e59e34}{1},\color{#5cb194}{2}} & \cdots & x_{\color{#e59e34}{1},\color{#5cb194}{k}} \\ x_{\color{#e59e34}{2},\color{#5cb194}{1}} & x_{\color{#e59e34}{2},\color{#5cb194}{2}} & \cdots & x_{\color{#e59e34}{2},\color{#5cb194}{k}} \\ \vdots & \vdots & \ddots & \vdots \\ x_{\color{#e59e34}{n},\color{#5cb194}{1}} & x_{\color{#e59e34}{n},\color{#5cb194}{2}} & \cdots & x_{\color{#e59e34}{n},\color{#5cb194}{k}} \end{bmatrix} \end{align}` $$

- `nrow(df_langs)` = .remark-inline-output[\#\# [1] .hl[5]]; `ncol(df_langs)` = .remark-inline-output[\#\# [1] .hl2[2]]
- `df_langs[1,2]`: `\(\left( \color{#e59e34}{i = 1} \right)\)` and `\(\left( \color{#5cb194}{j = 2} \right)\)`

---

# Data Frames

- When passing columns of a data frame (or of a list etc.) to a function, we need to take care that their names "live" inside the data frame and not in the global environment.

.code50[

```r
# Throws an error:
lm(skill ~ background) # [3]; the lm object is of type list
```

]

.code50[

```r
m1 <- lm(df_langs$skill ~ df_langs$background)

# Or: lm(skill ~ background, data = df_langs)

summary(m1) # Generic function that summarizes the results of various model-fitting functions
```

```
## 
## Call:
## lm(formula = df_langs$skill ~ df_langs$background)
## 
## Residuals:
##          1          2          3          4          5 
##  1.986e-16  7.852e-01 -1.584e+00  7.990e-01 -3.827e-16 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)
## (Intercept)                0.7533     1.3720   0.549    0.638
## df_langs$backgroundR      -2.9702     1.9403  -1.531    0.265
## df_langs$backgroundStata   0.4758     1.5843   0.300    0.792
## 
## Residual standard error: 1.372 on 2 degrees of freedom
## Multiple R-squared:  0.7056, Adjusted R-squared:  0.4113 
## F-statistic: 2.397 on 2 and 2 DF,  p-value: 0.2944
```

]

.font70[<sup>[3]</sup>.hl[NOTE:] The formula object/class is built on the base type "language" and is important for many stats applications.]

---

# Data Frames

- This is the time to also plug Vincent Arel-Bundock's nice package [modelsummary](https://github.com/vincentarelbundock/modelsummary).
- It makes perfectly formatted regression tables/coef. plots for HTML/MD/LaTeX/Word/etc. a one-liner.
- It also has nice data summary functions which we will use next session.

---

# Data Frames

Result:

<table class="table table" style="width: auto !important; margin-left: auto; margin-right: auto; font-size: 19px; margin-left: auto; margin-right: auto;">
 <thead>
  <tr>
   <th style="text-align:left;"> </th>
   <th style="text-align:center;"> Model 1 </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> (Intercept) </td>
   <td style="text-align:center;"> 0.753 (1.372) </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Python (reference) </td>
   <td style="text-align:center;"> NA </td>
  </tr>
  <tr>
   <td style="text-align:left;"> R </td>
   <td style="text-align:center;"> -2.970 (1.940) </td>
  </tr>
  <tr>
   <td style="text-align:left;box-shadow: 0px 1px"> Stata </td>
   <td style="text-align:center;box-shadow: 0px 1px"> 0.476 (1.584) </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Num.Obs. </td>
   <td style="text-align:center;"> 5 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> R2 </td>
   <td style="text-align:center;"> 0.706 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> R2 Adj.
</td>
   <td style="text-align:center;"> 0.411 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> AIC </td>
   <td style="text-align:center;"> 20.8 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> BIC </td>
   <td style="text-align:center;"> 19.2 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Log.Lik. </td>
   <td style="text-align:center;"> -6.385 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> F </td>
   <td style="text-align:center;"> 2.397 </td>
  </tr>
</tbody>
</table>

---

# Type Conversion

In most cases, we can also convert an object to another type. Sometimes this has side effects.

.pull-left[.code70[

```r
df_langs_m <- as.matrix(df_langs) # Convert data.frame to matrix

print(df_langs_m)
```

```
##      background skill       
## [1,] "Python"   " 0.7533110"
## [2,] "Stata"    " 2.0143547"
## [3,] "Stata"    "-0.3551345"
## [4,] "Stata"    " 2.0281678"
## [5,] "R"        "-2.2168745"
```

]
]

.pull-right[.code70[

```r
typeof(df_langs_m[2, 1])
```

```
## [1] "character"
```

]
]

- As matrices are atomic, an implicit coercion happened.<sup>.font70[[4]]</sup>
- If a value cannot be converted, you get `NA`.

.font70[<sup>[4]</sup>.hl[Fine Point:] Coercion order is character > double > integer > logical.]

---

# Loading/Installing Packages

Classic:

```r
install.packages("tidyverse")
library(tidyverse)
```

If you are lazy (...for CRAN packages; recommended only for smaller projects/problem sets):

```r
install.packages("pacman")

pacman::p_load(
  tidyverse,
  patchwork,
  rio,
  data.table
)
```

---

# Loading/Installing Packages

- As mentioned, 17788 packages are available on CRAN to date.
- This is both a blessing and a [curse](https://en.wikipedia.org/wiki/Dependency_hell).
- Many useful packages also live only on GitHub.
- Escaping package/library dependency hell is a bit of a pain in R (much easier in Python and Julia).<sup>[5]</sup>
- To take it to another level, [Docker containers](https://environments.rstudio.com/docker) expand reproducibility by providing a full runtime environment (and they work well with R).
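
A low-tech first step towards reproducibility, sketched here with base R only, is to record which versions you actually ran. `R.version.string` and `packageVersion()` are built in; which packages you query is up to you (`stats` is used here only because it ships with every R installation).

```r
# Record the R version and the version of a package you rely on
r_version <- R.version.string
stats_version <- as.character(packageVersion("stats"))

cat("Run with", r_version, "and stats", stats_version, "\n")

# sessionInfo() prints the same kind of information for all attached packages
```
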

.font70[<sup>[5]</sup>.hl[NOTE:] The [packrat](https://github.com/rstudio/packrat) package is one option. [renv](https://rstudio.github.io/renv/articles/renv.html) is much better. You should definitely use it if you work on a larger project.]

---

# Loading/Installing Packages: Namespace Conflicts

- As we assign names to objects and there are as many packages (and many more functions) as a native English speaker has words in their [vocabulary](https://en.wikipedia.org/wiki/Vocabulary#Vocabulary_size), we naturally get some overlap.
- Diving deep into environments, package [namespaces](https://r-pkgs.org/namespace.html), etc. can take you down a [rabbit hole](http://blog.obeautifulcode.com/R/How-R-Searches-And-Finds-Stuff/).
- We will keep this very surface-level.
- R warns you about namespace conflicts when loading a package, e.g. by telling you that some objects are `masked from 'package:base'`.
- The package loaded last "wins", i.e. function `a` of package 2 masks function `a` of package 1.
- If you only need the masked function a few times, call it as `package1::functiona()`. Otherwise, rebind the name: `functiona <- package1::functiona` (no parentheses, since you assign the function itself rather than its result).

---

# The Tidyverse

> "The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures." [tidyverse website](https://www.tidyverse.org/)

- Developed by [Hadley Wickham](https://en.wikipedia.org/wiki/Hadley_Wickham), supported by RStudio.
- Packages that are easy to learn, that have a similar syntax, and are convenient.
- There are alternatives... base R is still stable and useful; data.table is fast; [collapse](https://raw.githubusercontent.com/SebKrantz/cheatsheets/master/collapse.pdf) (brand new) is even faster.
- In my opinion, the tidyverse is a bit verbose/wordy: a lot of functions, each doing one particular thing.
- You will always end up googling a lot. But that's something you will do a lot anyway (I do too!).
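
---

# The Tidyverse

To give a flavour of the trade-off, here is a hedged sketch of the same row filter written once in base R and once with dplyr. The toy data frame `df_toy` is invented for illustration, and the dplyr part only runs if the package happens to be installed. Calling `dplyr::filter()` with the package prefix also sidesteps the namespace conflict with `stats::filter()`.

```r
# Invented toy data, not part of the course materials
df_toy <- data.frame(
  background = c("Python", "Stata", "R"),
  skill = c(0.5, -1.2, 2.0)
)

# Base R: logical indexing on rows
base_result <- df_toy[df_toy$skill > 0, ]
print(base_result)

# dplyr: one verb per step (skipped if dplyr is not installed)
if (requireNamespace("dplyr", quietly = TRUE)) {
  dplyr_result <- dplyr::filter(df_toy, skill > 0)
  print(dplyr_result)
}
```

Both versions keep the rows with a positive `skill`; which one reads better is largely a matter of taste.
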

---

# Session 2 - Problem Set

---
class: inverse, center, middle
name: intro

# Next up: Data Wrangling